On the Expressive Power of Deep Polynomial Neural Networks
Kileel, Joe, Trager, Matthew, Bruna, Joan
We study deep neural networks with polynomial activations, particularly their expressive power. For a fixed architecture and activation degree, a polynomial neural network defines an algebraic map from weights to polynomials. The image of this map is the functional space associated to the network, and it is an irreducible algebraic variety upon taking closure. This paper proposes the dimension of this variety as a precise measure of the expressive power of polynomial neural networks. We obtain several theoretical results regarding this dimension as a function of architecture, including an exact formula for high activation degrees, as well as upper and lower bounds on layer widths in order for deep polynomial networks to fill the ambient functional space. We also present computational evidence that it is profitable in terms of expressiveness for layer widths to increase monotonically and then decrease monotonically. Finally, we link our study to favorable optimization properties when training weights, and we draw intriguing connections with tensor and polynomial decompositions.
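The dimension this abstract proposes can be probed numerically: for generic weights, the dimension of the (closure of the) image of the weight-to-coefficients map equals the rank of its Jacobian at a random point. Below is a minimal sketch of that computation, not taken from the paper; the architecture (widths 2, 3, 1), the degree-2 activation, and the helper names coeffs and jacobian are all illustrative assumptions.

# Hypothetical sketch (not the paper's code): estimate the dimension of the
# functional variety of a small polynomial network as the generic rank of the
# Jacobian of its weight-to-coefficients map.  Illustrative choices: widths
# (2, 3, 1) and entrywise activation sigma(t) = t^2.
import numpy as np

D_IN, D_HID = 2, 3  # input width 2, hidden width 3, scalar output

def coeffs(theta):
    # Map weights to the coefficients (c_xx, c_xy, c_yy) of the network
    # polynomial p(x, y) = sum_j v_j * (a_j x + b_j y)^2.
    W1 = theta[:D_HID * D_IN].reshape(D_HID, D_IN)  # hidden-layer weights
    v = theta[D_HID * D_IN:]                        # output-layer weights
    a, b = W1[:, 0], W1[:, 1]
    return np.array([v @ (a * a), 2.0 * (v @ (a * b)), v @ (b * b)])

def jacobian(f, theta, eps=1e-6):
    # Central-difference Jacobian of f at theta.
    cols = []
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        cols.append((f(theta + e) - f(theta - e)) / (2 * eps))
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
theta = rng.standard_normal(D_HID * D_IN + D_HID)  # generic weight vector
J = jacobian(coeffs, theta)
print("estimated dimension:", np.linalg.matrix_rank(J, tol=1e-8))
# Expect 3: this architecture fills the 3-dimensional ambient space of
# homogeneous binary quadratics, spanned by x^2, xy, y^2.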
Activation thresholds and expressiveness of polynomial neural networks
Finkel, Bella, Rodriguez, Jose Israel, Wu, Chenxi, Yahl, Thomas
Polynomial neural networks are important in both applications and theoretical machine learning. The function spaces and the dimensions of neurovarieties for deep linear networks have been studied, and new developments have appeared in the polynomial neural network setting. In particular, results on the choice of activation degree and on the dimension of the neurovariety have improved our understanding of the optimization of these networks and of the ability of shallow and deep networks to replicate target functions [21, 27]. These theoretical results have practical implications: for appropriate datasets, polynomial activation functions introduce higher-order interactions between inputs, which can reduce model complexity and computational cost and allow non-linear phenomena to be modeled more efficiently. Moreover, polynomial neural networks have been found to perform well in practice in high-impact fields such as healthcare and finance.
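As a concrete illustration of the higher-order interactions mentioned above (not code from the paper), expanding a single hidden unit with a degree-2 activation exposes the cross term that a purely linear layer cannot produce; the symbol names and the use of sympy are illustrative choices.

# Illustrative only: one neuron with activation sigma(t) = t^2 applied to a
# linear combination of two inputs yields a pairwise interaction term.
import sympy as sp

x1, x2, w1, w2 = sp.symbols("x1 x2 w1 w2")
unit = (w1 * x1 + w2 * x2) ** 2  # one hidden unit, squared activation
print(sp.expand(unit))
# Expected (up to term order): w1**2*x1**2 + 2*w1*w2*x1*x2 + w2**2*x2**2
# The x1*x2 cross term is a higher-order input interaction.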